Multiword Expression Recognition

Authors

  • Anoop Kunchukuttan
  • Om Damani
Abstract

In the recent past, the natural language processing community has recognized the important role that multiword expressions play in language. Simply put, a multiword expression (MWE) is a word collocation that exhibits markedly peculiar linguistic behaviour in terms of lexicalization, syntax or semantics. Among others, the ubiquitous compound nouns, idioms and phrasal verbs fall into this category. This has led to efforts to treat multiword expressions as a linguistic phenomenon to be explained through language formalisms. We survey the multiword expression landscape to understand their characteristics. Extracting MWEs automatically from unrestricted text is important both for furthering the development of language theories on MWEs and for the proper handling of MWEs by NLP applications. We survey approaches to the extraction and identification of MWEs in running text, as a precursor to exploring better ways to extract MWEs, especially from a domain-restricted corpus.

Similar resources

Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing

The integration of multiword expressions in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly pre-identified. This paper evaluates two empirical strategies to integrate multiword units in a real constituency parsing context and shows that the results are not as promising as has sometimes been suggested. Firstly, we show th...

Web Based Manipuri Corpus for Multiword NER and Reduplicated MWEs Identification using SVM

A web-based Manipuri corpus is developed for the identification of reduplicated multiword expressions (MWEs) and for multiword named entity recognition (NER). Manipuri is one of the rarely investigated languages, and its resources for natural language processing are not available in the required measure. The web content of Manipuri is also very poor. News corpus from a popular Manipuri news website is coll...

MULTILINGUAL MULTIWORD EXPRESSIONS Literature Survey

Multiword expressions are idiosyncratic word usages of a language that often have non-compositional meaning. Knowledge of multiword expressions is necessary for many NLP tasks, such as machine translation, natural language generation, named entity recognition and sentiment analysis. In order for other NLP applications to benefit from the knowledge of multiword expressions, they need to be ide...

Accounting for Contiguous Multiword Expressions in Shallow Parsing

In this paper, we focus on chunking that includes contiguous multiword expression recognition, namely super-chunking. In particular, we present different strategies to improve a super-chunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources. We report a substantial gain of 7.6 points in overall accuracy.

Strategies for Contiguous Multiword Expression Analysis and Dependency Parsing

In this paper, we investigate various strategies for both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of the French Treebank (Abeillé and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013). Our work focuses on using an alternative representation of syntactically regular MWEs, which capt...

Publication date: 2007